(54) System and method for automatically adding informational hypertext links to received
documents

(57) In a distributed computer system, an automated document annotation system and method adds
hypertext cross-references to a set of known information sources into documents requested by a
client computer in such a way that the merged document is displayable by existing Web browsers.
The distributed computer network incorporates a plurality of servers to store documents. Each
stored document has a unique document identifier and is viewable from a client computer having a
browser configured to request and receive documents over the network. An annotation proxy is a
software procedure configured to merge a requested document from a first server with hypertext
links to documents containing associated supplemental information. The set of hypertext links and
the criteria for identifying where such links should be added to requested documents are defined
by one or more dictionaries of cross-references. The annotation proxy then relays the merged
document to a receiver unit that is selected from another proxy, such as a firewall proxy or
another annotation overlay proxy, or the browser, which ultimately displays the merged document.
The annotation proxy optionally includes a dictionary generator that generates a dictionary of
references to documents requested by the user, each reference in the dictionary indicating the
textual context of the hypertext link or links used to request the associated document. The
generated dictionary represents information sources known and used by the user. The annotation
proxy then annotates requested documents with cross-references in the dictionary that was
generated by the annotation proxy.
Description

The present invention relates generally to computer networks, and particularly to proxy servers used to supplement
the information found in documents stored on computer networks.

BACKGROUND OF THE INVENTION

The World-Wide Web (WWW) links many of the servers making up the Internet, each storing documents identified
by unique universal resource locators (URLs). Many of the documents stored on Web servers are written in a
standard document description language called HTML (hypertext markup language). Using HTML, a designer of Web
documents can associate hypertext links or annotations with specific words or phrases in a document (these hypertext
links identify the URLs of other Web documents or other parts of the same document providing information related to
the words or phrases) and specify visual aspects and the content of a Web page.

A user accesses documents stored on the WWW using a Web browser (a computer program designed to display
HTML documents and communicate with Web servers) running on a Web client connected to the Internet. Typically this
is done by the user selecting a hypertext link (typically displayed by the Web browser as a highlighted word or phrase)
within a document being viewed with the Web browser. The Web browser then issues an HTTP (hypertext transfer
protocol) request for the requested document to the Web server identified by the requested document's URL. In response,
the designated Web server returns the requested document to the Web browser, also using HTTP.

Many entities, especially corporations that allow access from corporate systems to the Web, modify this document
access process by providing a firewall proxy running on a proxy server situated between the Web client running the
browser and the various Web servers hosting the requested documents. In this modified situation, all HTTP requests
issued by the browser and all documents returned by the Web servers are simply routed through the firewall proxy, which
implements a proxy server communications protocol that is a subset of HTTP. Apart from providing a buffer between
the Web client and servers, and preventing the client from receiving messages which violate certain security criteria, a
pure firewall proxy performs no additional operations on the transferred information. Another common type of firewall
proxy is a caching firewall proxy, which caches requested documents to provide faster subsequent access to those
documents.

The ease of access and page design provided by the Web has proved attractive to many types of users, e.g.,
individuals and corporations, who have not traditionally used the Internet. Additionally, the WWW is increasingly being used
for commercial purposes, such as advertising and sales. Together, the new users and new uses mean that an information
explosion is occurring on the Web. With this information explosion it is becoming increasingly important that Web
users be able to supplement the hypertext links in Web documents with additional hypertext links to additional
information resources. For example, a Web user may have previously located a set of Web pages at a number of
remote sites that relate to a particular field of interest (e.g., a particular field of engineering, science, music, etc.). The
user may wish to provide additional references within a received Web document to this previously located set of Web
pages by annotating the received Web document with hypertext links to these Web pages.

Embodiments of the present invention provide a system and method for automatically annotating a received
document so as to interconnect that document via hypertext links to a set of documents known to contain
supplemental information related to the topic of the received document.

In embodiments of the present invention, the annotation system and method are implemented in a manner that is
compatible with existing Web browsers and HTTP.

One system that uses a proxy server to dynamically modify received documents is the Open Software Foundation's
World Wide Web Agent Toolkit, or OreO. OreO allows users to build personal agents that can perform filtering functions
on requested documents before they are viewed using the Web browser. The agents created with OreO can be used in
a pipeline anywhere between a traditional Web client (i.e., Web browser) and a Web server to perform more complex and
varied filtering of Web transactions. For example, a user could connect an obscenity filter in series with a violence filter
to ensure appropriate Web browsing for their children. OreO makes this pipelining possible by providing agent
interfaces that make each agent look like a traditional Web client on one side and a proxy server on the other.

However, because the OreO toolkit does not address the creation of dictionaries or libraries of supplemental
materials, OreO agents are not well-suited to merge cross-references to supplemental materials from sources other than the
creator of a requested document with the requested document. Moreover, OreO agents can only perform filtering by
parsing all requested documents looking for occurrences of certain key phrases or patterns, then deleting or replacing
those key phrases or patterns.

Therefore, there is a need for a system that introduces a proxy server between Web servers and clients that allows
parts of requested documents to be annotated with hyper-link cross-references to supplemental materials before the
documents are viewed with a Web browser. Unlike the OreO agent, this system should perform the aforementioned
annotating based on sources of supplemental materials associated with Web servers that might be completely
unrelated to the author of the requested document. Ideally, a user should be able to indicate to the proxy server a set of well
established dictionaries, directories, or libraries of information sources for which cross-references should be merged
into received documents. Then, when the user requests a document, that request should be relayed through the proxy,
which merges the requested document with cross-references to the user-specified supplemental information sources.
The resulting merged document should be viewable with any existing Web browser.

Alternatively, the system should allow a user of the proxy to direct the proxy to generate and add to a dictionary of
cross-reference annotations from sources accessed by the user over a period of time. Then, when a user requests a
document, the proxy should be able to merge cross-references in the dictionary with the requested document, eliminating
the need to search the Web for the appropriate supplemental materials.

SUMMARY OF THE INVENTION

In summary, the present invention is a system and method for merging hypertext cross-references to a set of known
information sources with documents requested over the Web in such a way that the merged document is displayable by
existing Web browsers.

Specifically, the present invention provides a system and method for providing hypertext link annotations for docu- 
ments requested over a distributed computer network that incorporates a plurality of servers to store the documents. 
Each stored document has a unique document identifier and is viewable from a client computer having a browser con- 
figured to request and receive documents over the network. 

Another feature of the present invention is an annotation proxy, which is a software procedure configured to merge 
a requested document from a first server with hypertext links to documents containing associated supplemental infor- 
mation where the set of hypertext links and criteria for identifying where such links should be added to requested doc- 
uments are defined by one or more dictionaries of cross-references. The annotation proxy then relays the merged 
document to a receiver unit that is selected from another proxy (possibly a firewall proxy or another annotation overlay 
proxy) or the browser, which ultimately displays the merged document.

In a preferred embodiment the annotation proxy can generate a dictionary of references to documents requested
by the user, each reference in the dictionary indicating the textual context of the hypertext link or links used to request
the associated document. The generated dictionary thus represents information sources known and used by the user.
The annotation proxy can then annotate requested documents with cross-references in the dictionary that was
generated by the annotation proxy.

The present invention is also a method usable in the same type of computer network for providing hypertext link
annotations for a requested document. As a first step, at least one dictionary of hypertext links to supplemental
documents is stored. A merged document is then formed by merging a requested document stored on a first server with
hypertext link annotations from the dictionary when the text or other content in the document matches corresponding
merge criteria. This merged document is then relayed to a receiver selected from another proxy or said browser.
BRIEF DESCRIPTION OF THE DRAWINGS

Examples of the invention will be described in conjunction with the drawings, in which:

FIG. 1 is a block diagram of a distributed computer system incorporating the present invention.

FIG. 2 is a block diagram of a preferred embodiment of the present invention, showing the relationship between a
web client, a web server, and an annotation proxy server agent interposed between the web client and the web 
server. 


FIG. 3 is an illustration of an exemplary annotation directory showing the contents of a cross reference source field
and match pattern field.

FIG. 4 is an illustration of the manner in which an annotation in the form of a hypertext link to a specified URL is
added to a portion of a document.

FIG. 5 is an illustration of an exemplary annotation directory of an alternative embodiment of the invention showing
the contents of a cross reference source field, a match pattern field, and a relevance index field.



DESCRIPTION OF THE PREFERRED EMBODIMENTS

Referring to FIG. 1, there is shown a distributed computer system 100 having many client computers 102 and at
least one remotely located information server computer 104. In the preferred embodiment, each client computer 102 is
connected to the information server 104 via the Internet 106, although other types of communication connections could
be used. While most client computers are desktop computers, such as Sun workstations, IBM compatible computers
and Macintosh computers, virtually any type of computer can be a client computer.

In the preferred embodiment, each client computer 102 includes a communications interface 103 for communicating
with the information server 104 and/or a remote annotation proxy server 119 (if provided), RAM 105, a CPU 106, a
user interface 107, and memory 108 for storing an operating system 109, a World Wide Web browser program 110, at
least one cross reference dictionary or directory (Xref Directory 1) 112 and/or a URL pointer 114 to a cross reference
directory (Xref Directory 2) located on a remotely located computer, a cross reference directory generator procedure
116, and an annotation proxy server procedure 118. Note that in the context of annotation proxy servers, the term directory,
as in annotation directory, is synonymous with dictionary.

While in the preferred embodiment the annotation proxy server (Annotation Proxy Server A) 118 is executed on the
same hardware platform as the user's Web browser 110, the annotation proxy server 118 could also be executed on
another linked computer. In fact, multiple annotation proxy servers 118, 119 may be provided on network 100 and the
user may select the most appropriate annotation proxy server for the document requested. For example, in an
alternative embodiment of the invention, annotation proxy server 119 may be provided instead of or in addition to annotation
proxy server 118. For a remotely located proxy server 119, the client computer 102 requests a document (e.g. Doc 1)
from information server 104 with instructions to forward the document to proxy server 119. The document is annotated
upon receipt by the proxy server and then retransmitted to the requesting client over network 100.

In either embodiment, the annotation proxy server 118 includes a document merger procedure 122 which performs
document parsing and annotation, one or more cross reference (Xref) directories 124, and an Internet communications
manager 120. When the proxy server is resident on the same hardware as the client computer, communications
interface 103 may be incorporated into the Internet communications manager.

The information server 104 includes a central processing unit (CPU) 150, primary memory 152 (i.e., fast random
access memory) and secondary memory 154 (typically disk storage), a user interface 156, and a communications interface
158 for communication with the client computers 102 via the communications network 106. For the purposes of the
present discussion, it will be assumed that each information server's secondary memory 154 stores: an operating
system 160, a Web server procedure 162, and document files 164, 166, 168.

Referring to FIG. 2, there is shown a block diagram of an embodiment of the inventive system showing the
relationship between a web client computer 102, a plurality of web information servers 104, and an annotation proxy server 118
interposed between the web client computer 102 and the web information server 104. In the embodiment
illustrated in FIG. 2, server 104a stores a document (Doc 1) 169 in document storage 180, server 104b stores a plurality of
documents (Doc 2, Doc 3, Doc 4) 164, 166, 167 in document storage 182, and server 104c stores a plurality of
documents (Doc 5, Doc 6, Doc 7) 171, 172, 173 in document storage 184. Each web server 104a, 104b, and 104c has the
characteristics of information server 104 as already described relative to FIG. 1.

In the preferred embodiment, annotation proxy server 118 is located on the same platform as the client computer 102;
however, the annotation proxy server 118 may alternatively be located on a computer different from the client computer
102 on which the document request was initiated or on a web server 104 different from that on which the requested
document originally resides. Each document is identifiable by a unique document identifier. The document identifier may
include first location identifier data that identifies the location of the document as a particular web server location
(such as a URL reference to the Web site) on the distributed computer system 100, and may further include second
document identifier data that identifies the document within that particular web server site, such as a name. The
document need not actually contain or store the document identifier so long as the network 100, including server 104, provides
means for locating and addressing each document. For example, a file management system on server 104 may provide
file addressing capability once the request for a document has been routed from the client computer to the server 104
storing the requested document. In general, a requested document and any cross-referenced documents can be on the
same or different servers 104, at any Web sites anywhere.

Each annotation proxy server (APS) 118, 119 includes one or more annotation directories 191, 192. Each annotation
directory is uniquely identifiable, such as by name or number, so that a user associated with a client computer 102 may
select the desired annotation directory from among several that may be present on the proxy server 118, 119. Each
annotation directory 191, 192 includes a plurality of paired entries (e.g. 191a, 191b, 191c, 191d, 191e; and 192a, 192b,
192c, 192d) where each paired entry includes a cross reference document source field 194 and a match pattern field
195. Each cross-reference source field 194 identifies the unique location of a cross reference document, and each
match pattern field 195 defines a character pattern (including symbols, words, characters, phrases, numbers, and the
like). If the character pattern is found in a requested document, that indicates that an annotation linking the portion of
the document associated with the matching pattern to the paired cross reference source should be added to the
requested document. For example, if match pattern 3 in annotation directory 191 is the phrase "JAVA!" and the paired
cross-reference source 3 is SUN.COM.JAVAINFO, then a hyperlink annotation "(link to SUN.COM.JAVAINFO)" will be
added to the requested document in association with the "JAVA!" phrase pattern. Other fields may optionally be
provided in the directory, such as an optional relevance indicator field 196 to indicate the relevance or importance of the
associated match pattern 195 or cross-reference source 194. The optional use of relevance information is described in
greater detail hereinafter.
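The paired-entry layout of an annotation directory can be pictured with a short Python sketch. This is only an illustration under stated assumptions: the class names `DirectoryEntry` and `AnnotationDirectory`, the helper `sources_for`, and the use of plain substring matching are hypothetical and are not taken from the specification.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DirectoryEntry:
    """One paired entry: cross-reference source (194) and match pattern (195)."""
    xref_source: str                 # location of the cross-reference document
    match_pattern: str               # character pattern to look for
    relevance: Optional[int] = None  # optional relevance indicator (196)


@dataclass
class AnnotationDirectory:
    """A uniquely identifiable annotation directory (e.g. 191 or 192)."""
    name: str
    entries: list


# Hypothetical directory holding the "JAVA!" example from the text.
directory_191 = AnnotationDirectory(
    name="191",
    entries=[DirectoryEntry("SUN.COM.JAVAINFO", "JAVA!")],
)


def sources_for(directory, text):
    """Return the sources whose pattern occurs literally in the text (assumed matching rule)."""
    return [e.xref_source for e in directory.entries if e.match_pattern in text]
```

Keeping the relevance field optional mirrors the text, where field 196 is described as an optional addition to the two required fields.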

When web client 102 requests a document such as document "Doc3" 166 stored in document storage 182 located
on web server 104b using web browser 110, the user associated with client computer 102 also specifies an annotation
proxy server 118 and one of the annotation directories 191, 192 provided on that server. If the annotation proxy server
118 has only a single annotation directory, such as when the proxy server is resident on the client computer making the
request and the user has provided an annotation directory for use on all requested documents, then explicit
specification of the directory may be unnecessary. Furthermore, in the preferred embodiment the user may specify an annotation
proxy and set of annotation directories to be used for annotating all future document requests until the user specifies a
different annotation proxy and/or set of annotation directories.

Further, a particular annotation proxy server 118 may either be specified by an explicit
command from the client 102 at the time the document is requested or implicitly specified, such as using the proxy server
118 resident on the client computer as a default if no other proxy server is specified, or based on characteristics of the
requested document, user history, or other user preferences. When explicit specification of a proxy is required or
desired, the user associated with the client computer may specify a particular annotation proxy server 118 and
annotation directory by clicking one or more buttons on the client web page, or by entering an annotation proxy server identifier
(such as by entering a proxy server name or URL) and an annotation proxy directory name or URL.

A document request on the client computer 102 ultimately results in receipt of a version of the document which is
annotated with cross references in accordance with the selected annotation proxy server and annotation directory. The
specific commands generated and the command and data pathways on the network 100 will depend somewhat on the
locations of the requesting client 102, the information server 104 storing the requested document, and the annotation proxy
server 118. In particular, the command and data pathways will depend on whether the proxy server 118 is resident on
the requesting client computer 102, resident on the same information server 104 that is providing the requested
document, or provided by a separate annotation proxy computer site on the network.

In one embodiment where the annotation proxy server 118 is provided on the requesting client computer 102, the
document request command 201 (which may include a requesting client computer identifier, a unique document
identifier for the requested document, an identifier for the proxy server that will annotate the document, and an annotation
directory identifier when applicable) is routed internally to the proxy server 118, which in turn transmits a request to the
server 104 for the document using the unique document identifier and the requesting computer identifier. Information
server 104 provides the requested document to the proxy server 118, which applies the identified annotation directory
to the received document and provides the merged document to the browser 110 for viewing on the requesting client
computer 102.

Once the request for a document is received and recognized by the web server on which the requested document is
stored, the web server prepares the document and transmits the document to the annotation proxy server 118 (which
may be the same or a different computer from the requesting client computer) for annotation. If the annotation is
performed on a remote proxy server 118, then annotation is performed prior to transmission of the document to the client
102, in a conventional manner.

In a different embodiment, the requesting computer may receive the unannotated document, retransmit it to any
desired annotation proxy server and then receive the annotated document back from the proxy server after annotation.
However, while such a system and method are operable, they are less efficient.

The manner of annotating a document is now described with reference to FIG. 3. The annotation proxy server 118
includes a set of hypertext linking rules or document merger procedures 122 for adding annotations, such as in the form
of hypertext links, to a requested document. In simplest terms, the annotation proxy server parses the requested
document and compares the characters, words, phrases, and the like with match patterns 195 in the selected annotation
directory. Various search strategies and search engines for performing such comparisons are known in the art and are
not discussed further. When a pattern identified in the designated annotation directory 191, 192 is present in the
requested document, an annotation is performed by adding to the requested document one or more cross references to
the document associated with the identified pattern.
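The parse-and-compare step can be sketched in a few lines of Python. This is a minimal sketch assuming literal (non-wildcard) match patterns and the "(link to ...)" annotation text used in the examples; the function name `merge_document` and the single-pass regular-expression approach are assumptions, not the document merger procedure 122 itself.

```python
import re


def merge_document(document, directory):
    """Append "(link to SOURCE)" after every literal pattern occurrence.

    `directory` is a list of (match_pattern, xref_source) pairs. A single
    regex pass is used so that inserted annotations are never re-scanned.
    """
    if not directory:
        return document
    lookup = dict(directory)
    # Try longer patterns first so a short pattern cannot shadow a longer one.
    alternation = "|".join(
        re.escape(p) for p in sorted(lookup, key=len, reverse=True)
    )

    def add_link(match):
        phrase = match.group(0)
        return f"{phrase} (link to {lookup[phrase]})"

    return re.sub(alternation, add_link, document)
```

A document with no matching pattern is returned unchanged, so it remains viewable exactly as the originating server sent it.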

For example, with reference to FIG. 2, two exemplary annotation directories 191, 192 are shown. Each annotation
directory 191, 192 includes a plurality of paired entries (e.g. 191a, 191b, 191c, 191d, 191e; and 192a, 192b, 192c,
192d) where each paired entry includes a cross reference document source field 194 and a match pattern field 195.
Each cross-reference source field 194 identifies the unique location of a cross reference document, and each match
pattern field 195 defines a character pattern (including symbols, words, characters, phrases, numbers, and the like) that
defines where annotation hyperlinks to the cross reference document should be added to requested documents.

In reference to FIG. 3, there is shown a more specific example of entries in an annotation directory. Here, the entry
URLX1 corresponds to the generic entry Xref Source 1, and the entry "music synthesi*" w/10 "signal process*"
corresponds to the generic entry match pattern 1 of annotation directory 191 of FIG. 2. The "*" in the match pattern indicates
a so-called "wild card" character or characters which stand for no characters or one or more characters at that position
in the text. Use of such wild card characters is known in conventional search techniques and is not discussed further. In
this example, whenever the text string "music synthesi*" appears within 10 words of the text string "signal process*" in
the requested document, the requested document is annotated with an annotation to cross reference source 1. If the
cross-reference "URLX1" is stored in the cross reference field 191a, then the document is annotated with "(link to
CR=URLX1)" where CR means cross-reference.

Similarly, if the text "GPS" appears anywhere in the requested document then a link to URLX2 is established in the
requested document. The pattern "GPS" is an example of a simple pattern, that is, a simple text string that does not
include logical or boolean operators between search pattern segments. By comparison, the pattern "music synthesi*"
w/10 "signal process*" is an example of a complex pattern which also includes boolean operations, proximity
indicators (e.g. the within ten words "w/10" operator) and like operators. Various conventional search strategies and
search engines, including strategies involving artificial intelligence and natural language processors, may be used in
conjunction with the inventive structure and method and are not described further herein.
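As an illustration of a complex pattern, the following Python sketch evaluates a trailing wild card together with a "w/N" proximity operator. It rests on several assumptions that the text does not settle: only a trailing "*" is handled, words are lower-cased alphanumeric runs, and distance is measured between the starting word positions of the two phrases. The names `phrase_positions` and `within` are hypothetical.

```python
import re


def phrase_positions(words, phrase):
    """Return word indices where `phrase` starts; a trailing * matches any ending."""
    parts = phrase.lower().split()
    hits = []
    for i in range(len(words) - len(parts) + 1):
        matched = True
        for offset, part in enumerate(parts):
            word = words[i + offset]
            if part.endswith("*"):
                matched = word.startswith(part[:-1])
            else:
                matched = word == part
            if not matched:
                break
        if matched:
            hits.append(i)
    return hits


def within(text, left, right, distance):
    """True if `left` and `right` start within `distance` words of each other."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    left_hits = phrase_positions(words, left)
    right_hits = phrase_positions(words, right)
    return any(abs(a - b) <= distance for a in left_hits for b in right_hits)
```

A simple pattern such as "GPS" needs none of this machinery, which is why the sketch keeps phrase matching and the proximity test in separate functions.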

In some embodiments of the invention, the annotations are defined using hypertext markup language (HTML). Of
course, annotations in formats other than HTML may be used. Those having ordinary skill in the art, in conjunction with
this specification, will realize that various syntax may be used in the annotation, including syntax compatible with
conventional hypertext links and HTML language protocols. The hypertext link is added to the text in the requested
document as indicated in FIG. 4 in a conventional manner.
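When the annotation is expressed in HTML, one conventional form is an anchor element wrapped around the matched text. The sketch below is an assumption about what such an annotation could look like, not a reproduction of FIG. 4; the function name `html_link` and the example URLs are hypothetical.

```python
from html import escape


def html_link(phrase, url):
    """Wrap `phrase` in an HTML anchor pointing at `url`, escaping both values."""
    return f'<a href="{escape(url, quote=True)}">{escape(phrase)}</a>'
```

Escaping both the phrase and the URL keeps the merged document well-formed even when the matched text contains characters such as "&" or "<".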

Each of the documents linked via the hypertext link annotations (e.g. source URLX1) is known to contain
supplemental information related to the topic of the received document by way of the linked term or phrase (e.g. "music
synthesi*" w/10 "signal process*").

In these examples, the annotations are hypertext links to other documents; however, the annotations are not limited
to hypertext links and other types of annotations may be added. The annotations formed, including hypertext links,
may be limited in any predetermined manner based on predetermined annotation limitation rules. Such rules may be
defined by the requesting user, or by an information provider. For example, certain areas of a document may be
selectively skipped or excluded from the parsing and annotation process when generating matches to the pattern for linking.
For example, program code areas of a document, or portions of a document that provide examples, or bibliographies,
or any other portions of a document that are readily identifiable may be excluded from pattern matching and annotation.
In some instances, the document portions to be skipped will be identifiable based on location within the document (the
title or footnotes, for example) while in other instances the portions to be skipped may be identified by the
characteristics of the terms themselves (such as courier font type style, upper or lower case, and like characteristics). The
limitations may alternatively define portions of the document to be parsed and annotated, or portions of the document to
be excluded from parsing and annotation. In some instances, parsing of the entire document may be required, in which
case annotation of undesired portions may be suppressed after parsing.
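One possible annotation limitation rule, skipping program code areas, can be sketched as follows. The sketch assumes that code areas are marked with HTML `<pre>` elements and that patterns are literal strings; both are illustrative assumptions, and the function name `annotate_outside_code` is hypothetical.

```python
import re

# Assumed convention: program code areas are enclosed in <pre>...</pre>.
PRE_BLOCK = re.compile(r"(<pre>.*?</pre>)", re.DOTALL | re.IGNORECASE)


def annotate_outside_code(document, pattern, source):
    """Annotate literal `pattern` occurrences everywhere except inside <pre> blocks."""
    pieces = PRE_BLOCK.split(document)
    merged = []
    for index, piece in enumerate(pieces):
        if index % 2 == 1:
            # Odd indices hold the captured <pre> blocks; leave them untouched.
            merged.append(piece)
        else:
            merged.append(piece.replace(pattern, f"{pattern} (link to {source})"))
    return "".join(merged)
```

Rules keyed to location (titles, footnotes) or to typography would follow the same shape: partition the document first, then annotate only the permitted partitions.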

In another embodiment of the invention, a natural language processor is provided for parsing the requested
document and determining the grammatical usage of a term in the document. Inclusion of such a natural language processor
would provide means for selectably including only terms used as nouns in the annotation while selectably suppressing
other grammatical forms (e.g., verbs or adverbs) from annotation.

Hypertext links may also contain a hierarchy of relevance indicators based on predetermined relevance rules. In
general the relevance indicator may identify the information as having high relevance or low relevance, such as a
relevance indicator based on a numerical scale (e.g. relevance from 1-10, where relevance 1 is the highest relevance).
In one embodiment of the invention, any hypertext links present in the document at the time of the request will be
allocated a higher relevance indicator than hypertext links added after the user's request and annotation.

The annotation including hypertext links may be provided in a hierarchical format. For example, when a term in the
document satisfies the match pattern in the annotation directory, the link may reflect a hierarchical cross-reference list
in order of increasing specificity such as: "medical", "oncology", "melanoma", "treatment", and "radiation".

In embodiments of the inventive system and method that include relevance indicators, the color, font, style, or other
attributes of the text associated with a hypertext link annotation may be altered to show the relevance. A variety of
conventional approaches to altering the color, the font style, and like attributes of linked terms may be implemented. In
a further embodiment of the invention, the user may set a threshold during viewing to indicate which relevance indicator
levels are to be displayed.

As described above, the annotations added to a document may optionally include a relevance information field 196
that provides information about the annotation, such as whether the annotation was present in the original document
as requested by the client 102 (high relevance), or whether the annotation was added by the annotation proxy server
118. An indication of the relevance to be assigned is stored in the relevance field 196 in association with each match
parameter 191a, 191b, 191c, 191d, 191e. After annotation, the document contains an indication of the assigned
relevance along with the annotation. For example, as illustrated in FIG. 5, the annotation may include an optional
Relevance Index (RI) such that when the match pattern occurs in the document, an annotation link is provided
("<link to CR=URLX1, RI=2>") to cross-reference source URLX1 with a relevance index RI=2.
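The association between a match pattern, a cross-referenced source, and a relevance index can be illustrated with a minimal Python sketch. The names `DirectoryEntry` and `annotate`, and the `data-ri` attribute used to carry the relevance index in the link, are assumptions made for illustration only; the specification does not prescribe a particular data layout or link syntax. The sketch also treats the document as plain text and does not parse existing HTML.

```python
import html
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class DirectoryEntry:
    """Hypothetical cross-reference item: match pattern, source, relevance."""
    match_pattern: str   # literal text to look for in the document
    url: str             # cross-referenced source, e.g. "URLX1"
    relevance: int = 1   # relevance index RI; 1 is the highest relevance


def annotate(text: str, entry: DirectoryEntry) -> str:
    """Wrap each whole-word occurrence of the match pattern in a link.

    The relevance index travels with the link in a data-ri attribute
    (an assumed convention, not one taken from the specification).
    """
    pattern = re.compile(r"\b" + re.escape(entry.match_pattern) + r"\b")

    def make_link(match):
        return '<a href="{}" data-ri="{}">{}</a>'.format(
            html.escape(entry.url, quote=True),
            entry.relevance,
            match.group(0),
        )

    return pattern.sub(make_link, text)


if __name__ == "__main__":
    entry = DirectoryEntry("radiation", "URLX1", relevance=2)
    print(annotate("radiation treatment", entry))
```

Keeping the relevance index on the link itself lets a later stage, such as a browser-side filter, decide how to display or suppress the annotation without consulting the directory again.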

A variety of rules may be invoked by the client 102 and implemented by the annotation proxy server 118 and/or the
client 102 to provide the desired relevance information. The assigned relevance index of the linked text may also affect
the attributes of linked terms as they appear on the viewing screen. For example, text linked with relevance index RI=1
may appear in red, whereas text linked with relevance index RI=2 may appear in green.
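One way to picture the color rule is the following Python sketch. The red and green assignments for RI=1 and RI=2 come from the example above; the fallback color, the inline `style` attribute, and the function name `styled_link` are assumptions chosen for illustration.

```python
import html

# RI=1 -> red and RI=2 -> green follow the example in the text.
RELEVANCE_COLORS = {1: "red", 2: "green"}
DEFAULT_COLOR = "blue"  # assumption: color for any other relevance index


def styled_link(text: str, url: str, relevance: int) -> str:
    """Return a hypertext link whose color reflects its relevance index."""
    color = RELEVANCE_COLORS.get(relevance, DEFAULT_COLOR)
    return '<a href="{}" style="color: {}">{}</a>'.format(
        html.escape(url, quote=True), color, html.escape(text)
    )


if __name__ == "__main__":
    print(styled_link("melanoma", "URLX1", 1))
    print(styled_link("treatment", "URLX2", 2))
```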

In embodiments of the invention where the annotation proxy server 118 is resident on the web information server
104 which provided the requested document, the annotation and merging to generate a hypertext link annotated
document may occur prior to transmission of the document to the client 102. If the annotation proxy server 118 is
resident on a different web information server than the one that provided the requested document, or on the client
computer 102 which requested the document, then the document is sent to the APS 118 for annotation to generate a
hypertext link annotated document, which is then transmitted to the requester.

Table 1 sets forth a Pseudocode Representation of Annotation Proxy Procedure. The Annotation Proxy Procedure
may include one or more of three subprocedures: (1) an Install Cross-Reference Directory subprocedure, (2)
an Uninstall Cross-Reference Directory subprocedure, and (3) a Request and Merge Document subprocedure.

The Install Cross-Reference Directory subprocedure is responsible for retrieving and adding a document(DocURL)
to the set of dictionaries (directories) used by the Annotation Proxy Procedure. The Uninstall Cross-Reference Directory
subprocedure is responsible for deleting the appropriate installed directories depending upon the value of the DocURL
parameter in the subprocedure call. If DocURL = then all of the installed directories are deleted; otherwise, only the
specified directory is deleted. The Request and Merge Document subprocedure requests and receives the document
specified by the DocURL parameter in the subprocedure call. For all items in all installed cross-reference directories,
the subprocedure finds or locates all text matching a specified pattern and inserts (annotates) a cross-reference to the
corresponding document. It then sends the merged document to the requester, where the requester may be the client or
another proxy.

Cross-reference directories may originate or be provided by various entities. For example, predefined dictionaries
may be prepared by information service providers, educational institutions, publishers, and the like, for use by a
variety of users. Such predefined cross-reference directories are at known URLs. Cross-reference directories may also
be generated by the client.

Cross-reference directories generated by the client include at least two types. A first type of dictionary, referred
to here as a "frequency of usage directory," may be maintained in a manner that automatically keeps track of the
frequently visited web pages and the key words associated with their hypertext links. In a second type of dictionary,
referred to here as a "user maintainable directory," the directory may be maintained in a manner which also includes a
link to an optional directory generator 116 that allows the client/user to modify the directory by, for example,
instructing the directory generator 116 via the web browser to add a particular document to "my personal
cross-reference directory," or by editing the match pattern criteria if the user is not satisfied with the default
matching pattern provided in an existing annotation directory. Aspects of the two user generated dictionaries may be
combined, and either or both may be used in combination with predefined dictionaries created or maintained by others.

In another embodiment of the invention, the cross-reference directories 112 may be automatically generated,
referred to here as "self-generating directories." In such a self-generating cross-reference directory 112, a directory
generator 116 resides in or in association with a document provider, web information server 104, client computer
102, annotation proxy server 118, or any other location on network 100 through which documents pass and could be
examined to identify links between particular terms present in the document and cross-link references, or
between one document source and another document source generally. The cross-reference dictionary 112, 191, 192
grows over time as the number of documents read increases. Rules and algorithms may be implemented in the
directory generator 116 to provide predictability to the automatically generated directory.

In the embodiment of the invention illustrated in FIG. 1, the directory generator 116 resides on the
client computer 102. This may be the preferred location for constructing a personal user annotation directory because
the annotations and cross references are derived from documents requested by the particular user and the cross
references are expected to be relevant to the user's interests. On the other hand, a directory generator residing
elsewhere on the network 100 that sees a large number of documents is better positioned to construct a very complete
and hierarchically deep annotation directory. Such a directory may be somewhat disadvantageous because of its
potential size, and may include cross references that are somewhat irrelevant to a client computer's needs.

In the preferred embodiment that includes the dictionary generator 116, the "match pattern" of each cross-reference
item 191, 192 in the automatically generated dictionary is the text for the hyperlink used to request the associated
document. Alternately, the match pattern in the dictionary may be the text for the hypertext link plus a portion of the
preceding text (e.g., the preceding text going back to the beginning of the sentence or document section, but not more
than [...]). Furthermore, the document merger procedure 122 in this embodiment inserts annotations even when
there is not an exact match between the match pattern of a dictionary item and the text of a requested document. In
particular, the document merger procedure 122 looks for partial matches, and for each partial or full match that meets
a threshold match requirement (e.g., a requirement of a match to at least the core portion of a match term) the merger
procedure inserts a hyperlink annotation that includes a relevance indicator.
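The idea of using hyperlink text as a match pattern can be sketched in Python with the standard `html.parser` module. This is only an illustrative sketch under stated assumptions: the names `LinkCollector` and `generate_dictionary` are hypothetical, and the sketch records every link found in a page, whereas the dictionary generator 116 described above records the links actually used by the user to request documents.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect (link text -> URL) pairs from the anchors of an HTML page."""

    def __init__(self):
        super().__init__()
        self.entries = {}    # match pattern (link text) -> target URL
        self._href = None    # href of the anchor currently open, if any
        self._text = []      # text fragments seen inside that anchor

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            # Normalize internal whitespace so the pattern is one clean phrase.
            pattern = " ".join("".join(self._text).split())
            if pattern:
                self.entries[pattern] = self._href
            self._href = None


def generate_dictionary(html_text: str) -> dict:
    """Return a dictionary of match patterns derived from hyperlink text."""
    collector = LinkCollector()
    collector.feed(html_text)
    collector.close()
    return collector.entries


if __name__ == "__main__":
    page = '<p>See <a href="http://example.com/m">melanoma  treatment</a>.</p>'
    print(generate_dictionary(page))
```

Extending the pattern with a portion of the preceding text, as in the alternate embodiment, would require tracking the text seen before each anchor; that step is omitted here.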

The relevance indicator is assigned a value in this preferred embodiment on a sliding scale such as 1 to 10 (where
1 represents the highest degree of relevance) based on the closeness of the match between the match pattern in the
dictionary and the text of the requested document. Furthermore, the user may specify to the merger procedure 122 a
relevance threshold. When a relevance threshold is specified, only annotations with an assigned relevance value equal
to or higher than the relevance threshold (i.e., with an equal or lower numeric relevance value using the sliding scale
mentioned above) are added to user requested documents. As indicated above, the value of the relevance indicator for
each annotation can be indicated to the user (A) by displaying the relevance indicator for an annotation when it is
selected by the user, or (B) by altering a visual characteristic of the text associated with the annotation, such as the
text's color, font, or style to indicate the value of the relevance indicator of each annotation.
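The sliding scale and threshold can be illustrated with a short Python sketch. The specification does not say how closeness of match is measured; the sketch substitutes the similarity ratio from the standard `difflib.SequenceMatcher` purely as a stand-in, and the function names and the linear mapping onto 1 to 10 are assumptions.

```python
from difflib import SequenceMatcher


def relevance_for(match_pattern: str, candidate: str) -> int:
    """Map closeness of match onto a 1-10 scale, where 1 is most relevant.

    Closeness is approximated with difflib's similarity ratio (an assumed
    stand-in): ratio 1.0 maps to relevance 1, ratio 0.0 to relevance 10.
    """
    ratio = SequenceMatcher(None, match_pattern.lower(), candidate.lower()).ratio()
    return 10 - round(ratio * 9)


def select_annotations(candidates, threshold: int):
    """Keep (text, relevance) pairs at least as relevant as the threshold.

    Because 1 is the highest relevance, "equal to or higher relevance"
    means an equal or lower numeric value.
    """
    return [(text, ri) for text, ri in candidates if ri <= threshold]


if __name__ == "__main__":
    print(relevance_for("radiation treatment", "radiation treatment"))
    print(select_annotations([("melanoma", 1), ("therapy", 4)], threshold=3))
```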

The above described "extent of matching" methodology for assigning relevance indicators to annotations during the
document merger process can be applied equally well to the use of cross reference dictionaries provided by third
parties.

TABLE 1

Pseudocode Representation of Annotation Proxy Procedure

Procedure: Install Cross-Reference Directory (DocURL)
{
    Retrieve and add document(DocURL) to set of dictionaries used by Annotation Proxy
    Procedure
}

Procedure: Uninstall Cross-Reference Directory (DocURL)
{
    If DocURL =
        { Delete all installed directories }
    Else
        { Delete specified directory(DocURL) }
}

Procedure: Request and Merge Document (DocURL)
{
    Request and receive document specified by DocURL
    For all items in all installed cross-reference directories:
    {
        Find all text matching specified pattern and insert cross-reference to
        corresponding document.
    }
    Send merged document to requester.
}
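The three subprocedures of Table 1 can be sketched in Python as follows. This is a minimal illustration, not the patented implementation. Several details are assumptions: the class name `AnnotationProxy`, the injected `fetch` callable standing in for network retrieval, the tab-separated directory format read by `parse_directory`, and the use of `None` for the DocURL value that selects "delete all" (the value compared against in Table 1 is not legible in the source).

```python
import re
from typing import Callable, Dict, Optional


def parse_directory(text: str) -> Dict[str, str]:
    """Parse an assumed directory format: one 'pattern<TAB>URL' per line."""
    entries = {}
    for line in text.splitlines():
        pattern, sep, url = line.partition("\t")
        if sep and pattern.strip() and url.strip():
            entries[pattern.strip()] = url.strip()
    return entries


class AnnotationProxy:
    def __init__(self, fetch: Callable[[str], str]):
        self._fetch = fetch                               # retrieves a document by URL
        self._directories: Dict[str, Dict[str, str]] = {}  # DocURL -> entries

    def install_directory(self, doc_url: str) -> None:
        """Retrieve a directory document and add it to the installed set."""
        self._directories[doc_url] = parse_directory(self._fetch(doc_url))

    def uninstall_directory(self, doc_url: Optional[str] = None) -> None:
        """Delete one directory, or all of them when doc_url is None (assumed)."""
        if doc_url is None:
            self._directories.clear()
        else:
            self._directories.pop(doc_url, None)

    def request_and_merge(self, doc_url: str) -> str:
        """Fetch the document and insert a cross-reference at every match."""
        document = self._fetch(doc_url)
        targets: Dict[str, str] = {}
        for directory in self._directories.values():
            for pattern, target in directory.items():
                targets.setdefault(pattern, target)
        if not targets:
            return document
        # One combined pass, longest patterns first, so inserted links are
        # never themselves re-annotated.
        alternation = "|".join(
            re.escape(p) for p in sorted(targets, key=len, reverse=True)
        )
        combined = re.compile(r"\b(?:" + alternation + r")\b")
        return combined.sub(
            lambda m: '<a href="{}">{}</a>'.format(targets[m.group(0)], m.group(0)),
            document,
        )


if __name__ == "__main__":
    pages = {
        "dir1": "melanoma\thttp://example.com/melanoma\n",
        "doc": "Treating melanoma early.",
    }
    proxy = AnnotationProxy(pages.__getitem__)
    proxy.install_directory("dir1")
    print(proxy.request_and_merge("doc"))
```

Passing `fetch` in as a parameter keeps the sketch runnable without a network and mirrors the proxy's position between the browser and the servers; the returned string plays the role of the merged document sent to the requester.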
Claims

1. In a distributed computer system incorporating a plurality of servers used to store documents, each said document
having a unique document identifier, and a client computer having a browser configured to request and receive said
documents over said distributed computer system, an annotation system for automatically adding to a requested
document cross references to other documents, said annotation system comprising:

at least one directory of cross references to documents, each cross referenced document having a unique 
source identifier; and 

an annotation proxy configured to form a merged document by merging said requested document from a first 
server with annotations comprising cross references to documents referenced by said at least one directory 
and to relay said merged document to a receiver selected from another proxy or said browser. 

2. The system of claim 1, wherein said annotations are hypertext links defined using hypertext mark up language
(HTML).

3. The system of claim 1, wherein said annotations are hypertext links, and said directory of cross references to
documents includes entries, each entry comprising:

a document identifier specifying a document; and 

a pattern, indicating criteria for inserting said document identifier into said set requested document when cre- 
ating said merged document. 

4. The system of claim 3, wherein at least a subset of said entries each includes a relevance indicator, indicating likely
relevance of said document.

5. The system of claim 1, wherein said annotation proxy includes instructions for accepting commands from said
client computer identifying a set of directories to use when annotating said requested document, and for forming said
merged document by merging said requested document with annotations comprising cross references to
documents referenced by said client computer identified set of directories.

6. A method for automatically adding to a requested document cross references to other documents, said method
comprising the steps of:

recognizing a request for a stored document by a client; 

transmitting said requested document to an annotation proxy for annotation; 

providing, in association with said annotation proxy, at least one directory of cross references to documents,
each cross referenced document having a unique source identifier;

merging said requested document with annotations comprising cross references to documents referenced by 
said at least one directory; and 

relaying said merged document to a receiver selected from another proxy or said client. 

7. The method of claim 6, wherein said annotations are hypertext links defined using hypertext mark up language
(HTML).

8. The method of claim 6, wherein said annotations are hypertext links, and said directory of cross references to
documents includes entries, each entry comprising:

a document identifier specifying a document; and 

a pattern, indicating criteria for inserting said document identifier into said set requested document when cre- 
ating said merged document. 

9. The method of claim 8, wherein at least a subset of said entries each includes a relevance indicator, indicating likely
relevance of said document.

10. The method of claim 6, including accepting commands from said client identifying a set of directories to use when
annotating said requested document, and forming said merged document by merging said requested document
with annotations comprising cross references to documents referenced by said client computer identified set of
directories.